Abstract: A Q-learning algorithm is used to solve the optimal stabilization control problem when only data, rather than a model of the plant, are available. Because the state space and control space are continuous, Q-learning can only be implemented approximately, so the proposed approximate Q-learning algorithm yields a suboptimal controller. Although the controller is suboptimal, simulations on a strongly nonlinear plant show that the closed-loop domain of attraction obtained by the proposed algorithm is larger, and the cost function smaller, than those of the linear quadratic regulator and the deep deterministic policy gradient method.
LU Chaolun, LI Yongqiang, FENG Yuanjing. Data-Driven Optimal Stabilization Control and Simulation Based on Reinforcement Learning. Pattern Recognition and Artificial Intelligence, 2019, 32(4): 345-352.
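The abstract describes approximating the Q-function over continuous state and control spaces from measured plant data and then extracting a (suboptimal) stabilizing controller from it. The following Python code is a minimal sketch of that general idea using batch fitted Q-iteration with a radial-basis-function approximator; the class and function names, the RBF model, the finite action grid, and the discounted stage cost are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

# Minimal sketch of approximate Q-learning (fitted Q-iteration) for a
# deterministic plant known only through sampled transition data.
# Everything here (QModel, transitions format, action grid) is illustrative.

class QModel:
    """Radial-basis-function approximator for Q(x, u)."""
    def __init__(self, centers, width=1.0):
        self.centers = np.asarray(centers)   # (m, dim_x + dim_u) RBF centers
        self.width = width
        self.w = np.zeros(len(self.centers)) # linear weights

    def features(self, xu):
        d2 = np.sum((self.centers - xu) ** 2, axis=1)
        return np.exp(-d2 / (2.0 * self.width ** 2))

    def predict(self, xu):
        return self.features(xu) @ self.w

    def fit(self, XU, targets):
        # Least-squares fit of the weights to the Bellman targets.
        Phi = np.stack([self.features(xu) for xu in XU])
        self.w, *_ = np.linalg.lstsq(Phi, targets, rcond=None)


def fitted_q_iteration(transitions, q_model, u_grid, gamma=0.98, iters=50):
    """transitions: list of (x, u, cost, x_next) tuples measured from the plant."""
    XU = np.array([np.concatenate([x, u]) for x, u, _, _ in transitions])
    for _ in range(iters):
        targets = []
        for x, u, c, x_next in transitions:
            # Minimization over a finite action grid: one source of suboptimality.
            q_next = min(q_model.predict(np.concatenate([x_next, ug]))
                         for ug in u_grid)
            targets.append(c + gamma * q_next)
        q_model.fit(XU, np.array(targets))
    return q_model


def greedy_controller(q_model, u_grid):
    """Suboptimal controller extracted from the learned Q-function."""
    def policy(x):
        costs = [q_model.predict(np.concatenate([x, ug])) for ug in u_grid]
        return u_grid[int(np.argmin(costs))]
    return policy
```

In this sketch, the finite action grid and the finite set of basis functions are the two approximations that make the resulting controller suboptimal, which is consistent with the abstract's remark that approximate Q-learning over continuous spaces can only deliver a suboptimal stabilizing controller.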